Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies by JOHN D. KELLEHER & Brian Mac Namee & Aoife D'Arcy

Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies by JOHN D. KELLEHER & Brian Mac Namee & Aoife D'Arcy

Author:JOHN D. KELLEHER & Brian Mac Namee & Aoife D'Arcy
Language: eng
Format: mobi
ISBN: 9780262029445
Publisher: MIT Press
Published: 2015-07-30T21:00:00+00:00


where denotes the network graph, D is the training data, is the set of entries in the CPTs of , d is the number of parameters of (i.e., how many entries in the CPTs of ), and n is the number of instances in D. This metric contains a term describing how well the model predicts the data P(D|, ) as well as a term that punishes complex models . As such, it balances the search goals of model accuracy and simplicity. The term P(D|, ) can be computed using metrics such as the Bayesian score or the K2 score23. The search space for these algorithms is exponential in the number of features. Consequently, developing algorithms to learn the structure of Bayesian networks is an ongoing research challenge.24

It is much simpler to construct a Bayesian network using a hybrid approach, where the topology of the network is given to the learning algorithm, and the learning task involves inducing the CPT entries from the data. This type of learning illustrates one of the real strengths of the Bayesian network framework, namely, that it provides an approach to learning that naturally accommodates human expert information. In this instance, the human expert specifies that topology of the network, and the learning algorithm induces the CPT entries for nodes in the topology in the same way that we computed the conditional probabilities for the naive Bayes model.25

Given that there are multiple Bayesian networks for any domain, an obvious question to ask is what is the best topological structure to give the algorithm as input? Ideally, we would like to use the network whose structure most accurately reflects the causal relationships in the domain. Specifically, if the value of one feature directly influences, or causes, the value taken by another feature, then this should be reflected in the structure of the graph by having a link from the cause feature to the effect feature. Bayesian networks whose topological structure correctly reflects the causal relationships between the features in a dataset are called causal graphs. There are two advantages to using a causal graph: (1) people find it relatively easy to think in terms of causal relationships, and as a result, networks that encode these relationships are relatively easy to understand; (2) often networks that reflect the causal structure of a domain are more compact in terms of the number of links between nodes and hence are more compact with respect to the number of CPT entries.

We will use an example from social science to illustrate how to construct a causal graph using this hybrid approach. In this example, we will build a Bayesian network that enables us to predict the level of corruption in a country based on a number of macro-economic and social descriptive features. Table 6.18[305] lists some countries described using the following features26



Download



Copyright Disclaimer:
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.